Entity based Q&A Retrieval

Author

Amit Singh

Abstract

Bridging the lexical gap between a user's question and the question-answer pairs in Q&A archives has been a major challenge for Q&A retrieval. State-of-the-art approaches address this issue by implicitly expanding queries with additional words using statistical translation models. While useful, the effectiveness of these models is highly dependent on the availability of a high-quality corpus; without one, they suffer from noise. Moreover, these models perform word-based expansion in a context-agnostic manner, producing translations that can be mixed and fairly general, which degrades retrieval performance. In this work we address these issues by extending the lexical, word-based translation model to incorporate semantic concepts (entities). We explore strategies for learning the translation probabilities between words and concepts using the Q&A archives and a popular entity catalog. Experiments conducted on large-scale real data show that the proposed techniques are promising.
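As a rough illustration of the kind of model the abstract refers to, the sketch below scores a query against an archived Q&A text with a generic translation-based language model, P(w|D) = (1 - lam) * sum_t P(w|t) * P_ml(t|D) + lam * P(w|C). It is not the author's method: the function name, the `ENTITY:` token convention, the smoothing weight, and all probabilities are hypothetical values chosen only to show how a word-to-entity translation entry could let a query match a text that shares no words with it.

```python
import math

def translation_lm_score(query, doc, trans_prob, collection_prob, lam=0.2):
    """Log-likelihood of `query` given `doc` under a translation-based LM.

    trans_prob maps (query_word, doc_token) -> P(query_word | doc_token).
    Missing pairs default to 1.0 for identical tokens and 0.0 otherwise,
    a simplification that does not keep the table normalised.
    """
    doc_len = max(len(doc), 1)  # avoid division by zero for an empty text
    counts = {}
    for token in doc:
        counts[token] = counts.get(token, 0) + 1

    score = 0.0
    for w in query:
        # Sum over document tokens (words or entity tokens) that can
        # "translate" into the query word w.
        translated = sum(
            trans_prob.get((w, t), 1.0 if w == t else 0.0) * c / doc_len
            for t, c in counts.items()
        )
        # Smooth with a background collection probability.
        p = (1 - lam) * translated + lam * collection_prob.get(w, 1e-6)
        score += math.log(p)
    return score

# Toy data: every number here is made up for illustration.
trans_prob = {
    ("fix", "repair"): 0.3,             # word -> word translation
    ("laptop", "ENTITY:MacBook"): 0.4,  # entity -> word translation
}
collection_prob = {"fix": 0.01, "laptop": 0.01}

query = ["fix", "laptop"]
doc_a = ["repair", "ENTITY:MacBook", "screen"]  # no lexical overlap with query
doc_b = ["bake", "bread", "oven"]

score_a = translation_lm_score(query, doc_a, trans_prob, collection_prob)
score_b = translation_lm_score(query, doc_b, trans_prob, collection_prob)
```

In this toy setup `doc_a` outranks `doc_b` even though it shares no surface words with the query, because the translation table links `repair` to `fix` and the hypothetical entity token to `laptop`.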


Similar resources

Question Answering with LCC's CHAUCER-2 at TREC 2007

In TREC 2007, Language Computer Corporation explored how a new, semantically-rich framework for information retrieval could be used to boost the overall performance of the answer extraction and answer selection components featured in its CHAUCER-2 automatic question-answering (Q/A) system. By replacing the traditional keyword-based retrieval system used in (Hickl et al. 2006c) with a new indexi...


Category-Based Query Modeling for Entity Search

• Entity ranking: topic consists of a keyword query (Q) and target categories (C)
• List completion: the topic also specifies example entities (E)
• Users often look for specific entities instead of documents mentioning them
• Entities represented by their Wikipedia page
• Introduces a general probabilistic framework for entity retrieval
• Focus on the use of category information in a theoret...


Bangor at TREC 2003: Q&A and Genomics Tracks

We present the QITEKAT Question-Answering system based on the conceptual theory of Knowing About Knowledge, which adopts an agent-based approach to extract information from suitable corpora. The components of the QITEKAT system entered by the School of Informatics, University of Wales, Bangor, in the 2003 Text Retrieval Conference are described in detail. We describe PPM compression techniques ...


A Comparative Study of Word Co-occurrence for Term Clustering in Language Model-based Sentence Retrieval

Sentence retrieval is a very important part of question answering systems. Term clustering, in turn, is an effective approach for improving sentence retrieval performance: the more similar the terms in each cluster, the better the performance of the retrieval system. A key step in obtaining appropriate word clusters is accurate estimation of pairwise word similarities, based on their tendency t...


A'laam Corpus: A Standard Named-Entity Corpus for the Persian Language

Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used in text summarization, data mining, data retrieval, question answering, machine translation, and document classification systems. A NER system is tasked with determining the boundaries of each named entity, recognizing its type, and classifying it into predefined categories. The categories of named...




Publication date: 2012
